Exercise 21

Author

Alex Smilor

library(dataRetrieval)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidymodels)
── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
✔ broom        1.0.7     ✔ rsample      1.2.1
✔ dials        1.3.0     ✔ tune         1.2.1
✔ infer        1.0.7     ✔ workflows    1.1.4
✔ modeldata    1.4.0     ✔ workflowsets 1.1.0
✔ parsnip      1.2.1     ✔ yardstick    1.3.1
✔ recipes      1.1.0     
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
• Use tidymodels_prefer() to resolve common conflicts.
library(modeltime)
library(tsibble)
Registered S3 method overwritten by 'tsibble':
  method               from 
  as_tibble.grouped_df dplyr

Attaching package: 'tsibble'

The following object is masked from 'package:lubridate':

    interval

The following objects are masked from 'package:base':

    intersect, setdiff, union
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout
library(forecast)
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 

Attaching package: 'forecast'

The following object is masked from 'package:yardstick':

    accuracy
library(feasts)
Loading required package: fabletools

Attaching package: 'fabletools'

The following object is masked from 'package:yardstick':

    accuracy

The following object is masked from 'package:parsnip':

    null_model

The following objects are masked from 'package:infer':

    generate, hypothesize
# Example: Cache la Poudre River at Mouth (USGS site 06752260)
poudre_flow <- readNWISdv(siteNumber = "06752260",    # Download data from USGS for site 06752260
                          parameterCd = "00060",      # Parameter code 00060 = discharge (cfs)
                          startDate = "2013-01-01",   # Set the start date
                          endDate = "2023-12-31") |>  # Set the end date
  renameNWISColumns() |>                              # Rename columns to standard names (e.g., "Flow", "Date")
  mutate(Date = yearmonth(Date)) |>                   # Convert daily Date values into a year-month format (e.g., "2023 Jan")
  group_by(Date) |>                                   # Group the data by the new monthly Date
  summarise(Flow = mean(Flow))                       # Calculate the average daily flow for each month
GET:https://waterservices.usgs.gov/nwis/dv/?site=06752260&format=waterml%2C1.1&ParameterCd=00060&StatCd=00003&startDT=2013-01-01&endDT=2023-12-31
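
One thing to watch in the monthly averaging step: mean() returns NA for any month that contains a missing daily value. The chunk above does not check whether this record has missing days, so the following is only a sketch on made-up data (the `daily` tibble, its values, and the `n_days` column are hypothetical) showing how na.rm = TRUE and a per-month day count could handle that case.

```r
library(dplyr)
library(tsibble)

# Hypothetical daily record: three months of constant flow, one day missing
daily <- tibble(
  Date = seq(as.Date("2023-01-01"), as.Date("2023-03-31"), by = "day"),
  Flow = c(rep(10, 31), rep(20, 28), rep(30, 31))
)
daily$Flow[5] <- NA   # simulate a gap on 2023-01-05

monthly <- daily |>
  mutate(Date = yearmonth(Date)) |>      # collapse each date to its year-month
  group_by(Date) |>
  summarise(
    n_days = sum(!is.na(Flow)),          # days that contributed to the mean
    Flow   = mean(Flow, na.rm = TRUE)    # monthly mean, ignoring missing days
  )

monthly
```

Keeping the day count next to the mean makes it easy to spot months whose average rests on only a few observations.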

1. Convert to tsibble

poudre_tbl <- as_tsibble(poudre_flow)
Using `Date` as index variable.
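
The message above appears because as_tsibble() had to guess the index. A minimal sketch of naming the index explicitly and checking for missing months before modelling follows; the `toy_flow` data are hypothetical stand-ins for `poudre_flow`, with one month removed on purpose.

```r
library(dplyr)
library(tsibble)

# Hypothetical stand-in for poudre_flow: 24 monthly means
toy_flow <- tibble(
  Date = yearmonth(seq(as.Date("2013-01-01"), by = "month", length.out = 24)),
  Flow = seq(10, 125, length.out = 24)
)
toy_flow <- toy_flow[-7, ]   # drop 2013 Jul to create an implicit gap

# Naming the index avoids the "Using `Date` as index variable." message
toy_tbl <- as_tsibble(toy_flow, index = Date)

# has_gaps() reports whether any time points are implicitly missing
gap_check <- has_gaps(toy_tbl)

# fill_gaps() makes the missing month explicit, with Flow set to NA
toy_filled <- fill_gaps(toy_tbl)
```

STL needs a regular series without missing values, so a check like this is a cheap safeguard before the decomposition step.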

2. Plotting the time series

poudre_tsplot <- poudre_tbl %>% 
  autoplot() +
  geom_line(color = "steelblue") +
  labs(title = "Interactive Flow Time Series", x = "Date", y = "Cubic Feet per Second")
Plot variable not specified, automatically selected `.vars = Flow`
ggplotly(poudre_tsplot)

3. Subseries

gg_subseries(poudre_tbl) +
  labs(title = "Monthly Flow Patterns", y = "Cubic Feet per Second", x = "Year") + 
  theme_minimal()
Plot variable not specified, automatically selected `y = Flow`

This plot shows that the main seasonal difference in flow at this gauge occurs in late spring and early summer, in May and June, when flow increases sharply, likely because of snowmelt. Flow is generally low for the rest of the year, with some variation; it is slightly higher in late summer than in fall and winter. Each subseries shows the flow for one calendar month across all years in the record.
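
The seasonal pattern read from the subseries plot can also be checked numerically by averaging each calendar month across years. This is a sketch on a made-up series (`toy_tbl` and its values are hypothetical); the same pipeline should work on `poudre_tbl`.

```r
library(dplyr)
library(tsibble)

# Hypothetical monthly series: three identical years with a June peak
toy_tbl <- tsibble(
  Date = yearmonth(seq(as.Date("2013-01-01"), by = "month", length.out = 36)),
  Flow = rep(c(5, 5, 8, 20, 90, 120, 40, 25, 15, 10, 6, 5), times = 3),
  index = Date
)

month_means <- toy_tbl |>
  as_tibble() |>                                      # drop the time index before regrouping
  mutate(Month = lubridate::month(as.Date(Date))) |>  # calendar month, 1 to 12
  group_by(Month) |>
  summarise(mean_flow = mean(Flow)) |>
  arrange(desc(mean_flow))                            # highest-flow months first

month_means
```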

4. Decomposition

poudre_decomp <- poudre_tbl %>% 
  model(STL(Flow ~ season(window = 12))) %>% 
  components()

autoplot(poudre_decomp) +
  labs(title = "STL Decomposition of Flow", y = "Cubic Feet per Second") +
  theme_minimal()

The plot shows an overall trend in the data and a strong seasonal component: flows are much higher in the early summer of most years, though in some years the increase is relatively small. I think the seasonal component captures the general pattern of flow through the year, with a large peak in early summer, a smaller peak in late summer, and low flow in winter. The trend component shows that flow has been relatively consistent over the last few years but is markedly lower than in 2014-2016, which may have been much wetter than average. The seasonal component also appears to weaken slightly over time, with the summer peaks growing a little smaller each year.
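
The impression that the summer peaks shrink over time can be checked by pulling the yearly maximum of the seasonal component out of the decomposition. The sketch below uses a made-up series whose seasonal amplitude is built to decline (`toy_tbl`, `shape`, and `amp` are hypothetical); it assumes monthly data, for which STL names the seasonal column `season_year`. Applied to `poudre_decomp`, only the last pipeline would be needed.

```r
library(dplyr)
library(tsibble)
library(feasts)   # also attaches fabletools, which provides model() and components()

set.seed(1)
n_years <- 6
shape <- c(5, 5, 8, 20, 90, 120, 40, 25, 15, 10, 6, 5)     # one hypothetical year
amp   <- rep(seq(1, 0.5, length.out = n_years), each = 12) # amplitude shrinks each year

toy_tbl <- tsibble(
  Date = yearmonth(seq(as.Date("2013-01-01"), by = "month",
                       length.out = 12 * n_years)),
  Flow = rep(shape, n_years) * amp + rnorm(12 * n_years, sd = 2),
  index = Date
)

# Same STL specification as in the chunk above
toy_decomp <- toy_tbl |>
  model(STL(Flow ~ season(window = 12))) |>
  components()

# Largest seasonal value in each year
seasonal_peaks <- toy_decomp |>
  as_tibble() |>
  mutate(Year = lubridate::year(as.Date(Date))) |>
  group_by(Year) |>
  summarise(peak = max(season_year))

seasonal_peaks
```

If the `peak` column falls steadily from year to year, that supports reading the seasonal panel as a weakening summer pulse rather than year-to-year noise.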